Crosslingual Countability Classification with EuroWordNet
Authors
Abstract
We examine the hypothesis that noun countability is consistent for a given word semantics by way of a series of experiments involving EuroWordNet and the English and Dutch languages. The basic method involves determining a default set of countabilities for each EuroWordNet synset based on countability-mapped words in that synset, and testing the match between these countabilities and those of held-out words. As EuroWordNet provides crosslingual synset correspondences between Dutch and English, we are able to evaluate the method both monolingually for Dutch and English, and crosslingually between the two languages. We find that Dutch and English countabilities align as well crosslingually as they do monolingually.
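The held-out evaluation described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy synsets and placeholder words, the class labels, the strict-majority rule used to build the default set, and the exact-match scoring. The abstract does not specify how the default set is derived or how a match is scored, and no EuroWordNet data or API is used.

```python
from collections import Counter

# Hypothetical toy data: synset id -> {word: set of countability classes}.
# Placeholder words only; nothing here is taken from EuroWordNet.
SYNSETS = {
    "syn_a": {
        "w1": {"countable"},
        "w2": {"countable"},
        "w3": {"countable"},
    },
    "syn_b": {
        "w4": {"uncountable"},
        "w5": {"uncountable"},
        "w6": {"countable", "uncountable"},
    },
}


def default_countabilities(members, held_out):
    """Default class set for a synset, ignoring the held-out word.

    Assumed rule: a class is a default if strictly more than half of
    the remaining words carry it. Returns None if no words remain.
    """
    others = [word for word in members if word != held_out]
    if not others:
        return None
    counts = Counter()
    for word in others:
        counts.update(members[word])
    threshold = len(others) / 2
    return frozenset(c for c, n in counts.items() if n > threshold)


def leave_one_out_accuracy(synsets):
    """Fraction of held-out words whose classes exactly match the default."""
    hits = 0
    total = 0
    for members in synsets.values():
        for word, gold in members.items():
            predicted = default_countabilities(members, word)
            if predicted is None:
                continue  # singleton synset: nothing to vote with
            total += 1
            hits += predicted == frozenset(gold)
    return hits / total if total else 0.0


if __name__ == "__main__":
    print(f"leave-one-out accuracy: {leave_one_out_accuracy(SYNSETS):.3f}")
```

Under the same assumptions, a crosslingual variant would tag each word with its language and let only words from the other language vote when a word is held out.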
Similar resources
The Ins and Outs of Dutch noun countability classification
This paper presents a range of methods for classifying Dutch noun countability based on either Dutch or English data. The classification is founded on translational equivalences and the corpus analysis of linguistic features which correlate with particular countability classes. We show that crosslingual classification on the basis of word-to-word or feature-to-feature mappings between English an...
Crosslingual Countability Classification: English meets Dutch
This paper presents a range of methods for classifying Dutch nouns as countable, uncountable or plural-only based on both Dutch and English data. The classification is based on the occurrence of countability-specific linguistic features that are extracted from unannotated corpora. We show that in the absence of reliable Dutch gold standard data, cross-linguistic classification can be achieved o...
Multilingual Training of Crosslingual Word Embeddings
Crosslingual word embeddings represent lexical items from different languages using the same vector space, enabling crosslingual transfer. Most prior work constructs embeddings for a pair of languages, with English on one side. We investigate methods for building high quality crosslingual word embeddings for many languages in a unified vector space. In this way, we can exploit and combine infor...
Reinforcing English Countability Prediction with One Countability per Discourse Property
Countability of English nouns is important in various natural language processing tasks. It especially plays an important role in machine translation since it determines the range of possible determiners. This paper proposes a method for reinforcing countability prediction by introducing a novel concept called one countability per discourse. It claims that when a noun appears more than once in ...
A sense-based lexicon of count and mass expressions: The Bochum English Countability Lexicon
The present paper describes the current release of the Bochum English Countability Lexicon (BECL 2.1), a large empirical database consisting of lemmata from Open ANC (http://www.anc.org) with added senses from WordNet (Fellbaum, 1998). BECL 2.1 contains ≈ 11,800 annotated noun-sense pairs, divided into four major countability classes and 18 fine-grained subclasses. In the current version, BECL ...